Highly Conserved Regimes of Neighbor-Base-Dependent Mutation Generated the Background Primary-Structural Heterogeneities along Vertebrate Chromosomes
Authors
Abstract
The guanine+cytosine (GC) content varies markedly along the chromosomes of homeotherms, and great effort has been devoted to studying this heterogeneity and its biological implications. Even before the DNA-sequencing era, however, it was established that the dinucleotides in the DNA of mammals in particular, and of most organisms in general, show striking over- and under-representations that cannot be explained by base composition. Here we show that in the coding regions of vertebrates both GC content and codon occurrences are strongly correlated with such "motif preferences", even though we quantify the latter using an index that is not affected by base composition, codon usage, or protein-sequence encoding. These correlations are likely to result from the long-term shaping of the primary structure of genic and non-genic DNA by a mutation regime whose central features have been maintained by natural selection. Indeed, we find that these preferences are conserved in vertebrates even more rigidly than codon occurrences, and we show that the occurrence-preference correlations are stronger in intronic and non-genic DNA, with R² values reaching 99% when GC content is approximately 0.5. The mutation regime appears to be characterized by rates that depend markedly on the bases at the sites immediately preceding and following each mutating site: when we estimate such rates of neighbor-base-dependent mutation (NBDM) from substitutions retrieved from alignments of coding, intronic, and non-genic mammalian DNA sorted and grouped by GC content, they suffice to simulate DNA sequences in which motif occurrences and preferences, as well as the correlations of motif preferences with GC content and with motif occurrences, are very similar to the mammalian ones.
The best fit, however, is obtained with NBDM regimes lacking strand effects, which indicates that over the long term NBDM switches strands in the germline, as one would expect for effects due to loosely contained background transcription. Finally, we show that human coding regions are less mutable under the estimated NBDM regimes than under matched context-independent mutation, and that this entails marked differences between the spectra of amino-acid mutations that the two mutation regimes should generate. In the Discussion we examine the mechanisms likely to underlie NBDM heterogeneity along chromosomes and propose that it reflects how the diversity and activity of lesion-bypass polymerases (LBPs) track the landscapes of scheduled and non-scheduled genome repair, replication, and transcription during the cell cycle. We conclude that the primary structure of vertebrate genic DNA at and below the trinucleotide level has been governed over the long term by highly conserved regimes of NBDM, which should be under direct natural selection because they drastically alter missense-mutation rates and hence the somatic and germline mutational loads. Therefore, the non-coding DNA of vertebrates may have been shaped by NBDM only epiphenomenally, with non-genic DNA being affected mainly when found in the proximity of genes.
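The idea that mutation rates depending on the flanking bases can produce dinucleotide over- and under-representations can be illustrated with a toy simulation. The sketch below is not the authors' index or their estimated NBDM rates: it uses the classical observed/expected dinucleotide ratio, and the function names, the rate table, and the choice of elevated rates at CpG contexts are assumptions made purely for illustration.

```python
import random
from collections import Counter

BASES = "ACGT"


def dinucleotide_odds_ratios(seq):
    """Classical observed/expected ratio f_xy / (f_x * f_y) per dinucleotide.

    This is the textbook measure, not the composition-independent index
    used in the paper.
    """
    n = len(seq)
    base_freq = {b: c / n for b, c in Counter(seq).items()}
    pairs = Counter(seq[i:i + 2] for i in range(n - 1))
    total = n - 1
    return {
        xy: (c / total) / (base_freq[xy[0]] * base_freq[xy[1]])
        for xy, c in pairs.items()
    }


def mutate_with_context(seq, rates, default_rate, generations, rng):
    """Mutate each interior site with a probability set by its triplet context.

    `rates` maps a triplet (previous base, site, next base) to a per-generation
    mutation probability; triplets not listed use `default_rate`.
    """
    s = list(seq)
    for _ in range(generations):
        old = s[:]  # contexts are read from the previous generation
        for i in range(1, len(old) - 1):
            context = old[i - 1] + old[i] + old[i + 1]
            if rng.random() < rates.get(context, default_rate):
                s[i] = rng.choice([b for b in BASES if b != old[i]])
    return "".join(s)


# Hypothetical rate table: a C followed by G, or a G preceded by C,
# mutates 50 times faster than any other site. These numbers are invented.
HYPOTHETICAL_RATES = {}
for flank in BASES:
    HYPOTHETICAL_RATES[flank + "CG"] = 0.05  # mutating site is the C of CpG
    HYPOTHETICAL_RATES["CG" + flank] = 0.05  # mutating site is the G of CpG

if __name__ == "__main__":
    rng = random.Random(0)
    start = "".join(rng.choice(BASES) for _ in range(5000))
    end = mutate_with_context(start, HYPOTHETICAL_RATES, 0.001, 50, rng)
    print("CpG ratio before:", dinucleotide_odds_ratios(start).get("CG", 0.0))
    print("CpG ratio after: ", dinucleotide_odds_ratios(end).get("CG", 0.0))
```

Under these assumed rates the CpG ratio falls well below 1 while base composition changes only slightly, which is the qualitative pattern the abstract attributes to neighbor-dependent mutation. Estimating real rates from aligned substitutions, as the paper does, is a separate task that this sketch does not attempt.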
Journal title:
- PLoS ONE
Volume 3, Issue
Pages -
Publication date: 2008